Machine-Learning for Spammer Detection in Crowd-Sourcing
Abstract
In a series of evaluation experiments with naive judges recruited and managed through Amazon’s Mechanical Turk, using a task from information retrieval (IR), we show that an SVM achieves very high accuracy when it is trained and tested on a single task, and that the method is portable from more complex tasks to simpler tasks, but not vice versa.
Similar resources
Active Learning and Crowd-Sourcing for Machine Translation
In recent years, corpus based approaches to machine translation have become predominant, with Statistical Machine Translation (SMT) being the most actively progressing area. Success of these approaches depends on the availability of parallel corpora. In this paper we propose Active Crowd Translation (ACT), a new paradigm where active learning and crowd-sourcing come together to enable automatic...
Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning
Crowd-sourcing has become a popular means of acquiring labeled data for many tasks where humans are more accurate than computers, such as image tagging, entity resolution, and sentiment analysis. However, due to the time and cost of human labor, solutions that rely solely on crowd-sourcing are often limited to small datasets (i.e., a few thousand items). This paper proposes algorithms for integrat...
Active Learning for Crowd-Sourced Databases
Crowd-sourcing has become a popular means of acquiring labeled data for many tasks where humans are more accurate than computers, such as image tagging, entity resolution, or sentiment analysis. However, due to the time and cost of human labor, solutions that solely rely on crowd-sourcing are often limited to small datasets (i.e., a few thousand items). This paper proposes algorithms for integr...
Annotating biomedical ontology terms in electronic health records using crowd-sourcing
Electronic health records have been adopted by many institutions and constitute an important source of biomedical information. Text mining methods can be applied to this type of information to automatically extract useful knowledge. We propose a crowd-sourcing pipeline to improve the precision of extraction and normalization of biomedical terms. Although crowd-sourcing has been applied in other...
Crowd-Sourced AI Authoring with ENIGMA
ENIGMA is an experimental platform for collaborative authoring of the behaviour of autonomous virtual characters in interactive narrative applications. The main objective of this system is to overcome the bottleneck of knowledge acquisition that exists in generative storytelling systems through a combination of crowd-sourcing and machine learning. While the authoring front-end of the applicatio...